Dataset statistics
| | Dataset A | Dataset B |
|---|---|---|
| Number of variables | 12 | 12 |
| Number of observations | 446 | 446 |
| Missing cells | 435 | 432 |
| Missing cells (%) | 8.1% | 8.1% |
| Duplicate rows | 0 | 0 |
| Duplicate rows (%) | 0.0% | 0.0% |
| Total size in memory | 45.3 KiB | 45.3 KiB |
| Average record size in memory | 104.0 B | 104.0 B |
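The percentage and per-record figures in this table can be reproduced from the raw counts. The sketch below assumes that the missing-cell percentage is taken over all cells (variables × observations) and that the average record size is the total size divided by the number of observations. Both are assumptions inferred from the reported numbers, not taken from the tool's documentation, and the function names are illustrative.

```python
# Figures copied from the Dataset statistics table above.
N_VARIABLES = 12
N_OBSERVATIONS = 446
MISSING_CELLS = {"Dataset A": 435, "Dataset B": 432}
TOTAL_SIZE_KIB = 45.3


def missing_cell_percent(missing: int, n_vars: int, n_obs: int) -> float:
    """Share of missing cells among all cells (variables x observations), in percent."""
    return 100.0 * missing / (n_vars * n_obs)


def average_record_bytes(total_kib: float, n_obs: int) -> float:
    """Average in-memory size of one row in bytes (1 KiB = 1024 B)."""
    return total_kib * 1024 / n_obs


for name, missing in MISSING_CELLS.items():
    pct = missing_cell_percent(missing, N_VARIABLES, N_OBSERVATIONS)
    print(f"{name}: {pct:.1f}% missing cells")

size = average_record_bytes(TOTAL_SIZE_KIB, N_OBSERVATIONS)
print(f"Average record size: {size:.1f} B")
```

Both datasets round to the same 8.1% even though their missing-cell counts differ by three, because the difference is below the one-decimal display precision.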
Variable types
| | Dataset A | Dataset B |
|---|---|---|
| Numeric | 5 | 5 |
| Categorical | 4 | 4 |
| Text | 3 | 3 |
Alerts
| Dataset A | Dataset B | Alert |
|---|---|---|
| Age has 88 (19.7%) missing values | Age has 88 (19.7%) missing values | Missing |
| Cabin has 346 (77.6%) missing values | Cabin has 344 (77.1%) missing values | Missing |
| PassengerId has unique values | PassengerId has unique values | Unique |
| Name has unique values | Name has unique values | Unique |
| SibSp has 309 (69.3%) zeros | SibSp has 298 (66.8%) zeros | Zeros |
| Parch has 336 (75.3%) zeros | Parch has 332 (74.4%) zeros | Zeros |
| Fare has 6 (1.3%) zeros | Fare has 5 (1.1%) zeros | Zeros |
Reproduction
| | Dataset A | Dataset B |
|---|---|---|
| Analysis started | 2024-01-08 15:05:20.561676 | 2024-01-08 15:05:24.828375 |
| Analysis finished | 2024-01-08 15:05:24.827309 | 2024-01-08 15:05:28.722837 |
| Duration | 4.27 seconds | 3.89 seconds |
| Software version | ydata-profiling v0.0.dev0 | ydata-profiling v0.0.dev0 |
| Download configuration | config.json | config.json |
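The Duration row is consistent with the difference between the two timestamps. A minimal sketch of that arithmetic with the standard library follows; the format string and the helper name `duration_seconds` are assumptions made for illustration.

```python
from datetime import datetime

# Timestamp layout used in the Reproduction table (assumed from its appearance).
FMT = "%Y-%m-%d %H:%M:%S.%f"

# (started, finished) pairs copied from the table above.
RUNS = {
    "Dataset A": ("2024-01-08 15:05:20.561676", "2024-01-08 15:05:24.827309"),
    "Dataset B": ("2024-01-08 15:05:24.828375", "2024-01-08 15:05:28.722837"),
}


def duration_seconds(started: str, finished: str) -> float:
    """Elapsed wall-clock time between two timestamps, in seconds."""
    delta = datetime.strptime(finished, FMT) - datetime.strptime(started, FMT)
    return delta.total_seconds()


for name, (started, finished) in RUNS.items():
    print(f"{name}: {duration_seconds(started, finished):.2f} seconds")
```

The timestamps also show that the second analysis started about a millisecond after the first one finished, so the two profiles were computed one after the other.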
PassengerId
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 446 | 446 |
| Distinct (%) | 100.0% | 100.0% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 447.11659 | 447.1704 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 1 | 1 |
| Maximum | 890 | 891 |
| Zeros | 0 | 0 |
| Zeros (%) | 0.0% | 0.0% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 1 | 1 |
| 5-th percentile | 37.25 | 40.5 |
| Q1 | 239.25 | 217.25 |
| median | 446.5 | 433 |
| Q3 | 666.75 | 689.5 |
| 95-th percentile | 841.75 | 856.75 |
| Maximum | 890 | 891 |
| Range | 889 | 890 |
| Interquartile range (IQR) | 427.5 | 472.25 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 254.98656 | 267.31341 |
| Coefficient of variation (CV) | 0.57029099 | 0.5977887 |
| Kurtosis | -1.1483252 | -1.2855882 |
| Mean | 447.11659 | 447.1704 |
| Median Absolute Deviation (MAD) | 212.5 | 233.5 |
| Skewness | -0.02202314 | 0.031052275 |
| Sum | 199414 | 199438 |
| Variance | 65018.148 | 71456.461 |
| Monotonicity | Not monotonic | Not monotonic |
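Several rows in the quantile and descriptive tables are derived from others: range is maximum minus minimum, IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, variance is the squared standard deviation, and the sum is the mean times the number of observations. The sketch below recomputes them from the reported PassengerId figures. The dictionary layout and the `derived` helper are illustrative, and small differences in the last digits are expected because the inputs are already rounded.

```python
# Figures copied from the PassengerId tables above.
STATS = {
    "Dataset A": {"n": 446, "min": 1, "max": 890, "q1": 239.25, "q3": 666.75,
                  "mean": 447.11659, "std": 254.98656},
    "Dataset B": {"n": 446, "min": 1, "max": 891, "q1": 217.25, "q3": 689.5,
                  "mean": 447.1704, "std": 267.31341},
}


def derived(s: dict) -> dict:
    """Recompute the derived statistics from the primary ones."""
    return {
        "range": s["max"] - s["min"],
        "iqr": s["q3"] - s["q1"],
        "cv": s["std"] / s["mean"],
        "variance": s["std"] ** 2,
        "sum": s["mean"] * s["n"],
    }


for name, stats in STATS.items():
    d = derived(stats)
    print(name, {key: round(value, 4) for key, value in d.items()})
```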
| Value | Count | Frequency (%) |
| 546 | 1 | 0.2% |
| 849 | 1 | 0.2% |
| 868 | 1 | 0.2% |
| 516 | 1 | 0.2% |
| 888 | 1 | 0.2% |
| 232 | 1 | 0.2% |
| 533 | 1 | 0.2% |
| 421 | 1 | 0.2% |
| 334 | 1 | 0.2% |
| 543 | 1 | 0.2% |
| Other values (436) | 436 | 97.8% |
| Value | Count | Frequency (%) |
| 746 | 1 | 0.2% |
| 376 | 1 | 0.2% |
| 204 | 1 | 0.2% |
| 535 | 1 | 0.2% |
| 243 | 1 | 0.2% |
| 57 | 1 | 0.2% |
| 819 | 1 | 0.2% |
| 737 | 1 | 0.2% |
| 195 | 1 | 0.2% |
| 555 | 1 | 0.2% |
| Other values (436) | 436 | 97.8% |
| Value | Count | Frequency (%) |
| 1 | 1 | 0.2% |
| 2 | 1 | 0.2% |
| 5 | 1 | 0.2% |
| 6 | 1 | 0.2% |
| 7 | 1 | 0.2% |
| 9 | 1 | 0.2% |
| 10 | 1 | 0.2% |
| 12 | 1 | 0.2% |
| 13 | 1 | 0.2% |
| 14 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 1 | 1 | 0.2% |
| 5 | 1 | 0.2% |
| 6 | 1 | 0.2% |
| 10 | 1 | 0.2% |
| 11 | 1 | 0.2% |
| 13 | 1 | 0.2% |
| 14 | 1 | 0.2% |
| 15 | 1 | 0.2% |
| 18 | 1 | 0.2% |
| 19 | 1 | 0.2% |
Survived
Categorical
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 2 | 2 |
| Distinct (%) | 0.4% | 0.4% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 1 | 1 |
| Median length | 1 | 1 |
| Mean length | 1 | 1 |
| Min length | 1 | 1 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 446 | 446 |
| Distinct characters | 2 | 2 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | 0 | 0 |
| 2nd row | 0 | 0 |
| 3rd row | 1 | 0 |
| 4th row | 0 | 0 |
| 5th row | 1 | 1 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 265 | 59.4% |
| 1 | 181 | 40.6% |
| Value | Count | Frequency (%) |
| 0 | 296 | 66.4% |
| 1 | 150 | 33.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 265 | |
| 1 | 181 |
| Value | Count | Frequency (%) |
| 0 | 296 | |
| 1 | 150 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 446 |
| Value | Count | Frequency (%) |
| Decimal Number | 446 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 265 | |
| 1 | 181 |
| Value | Count | Frequency (%) |
| 0 | 296 | |
| 1 | 150 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 446 |
| Value | Count | Frequency (%) |
| Common | 446 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 265 | |
| 1 | 181 |
| Value | Count | Frequency (%) |
| 0 | 296 | |
| 1 | 150 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 446 |
| Value | Count | Frequency (%) |
| ASCII | 446 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 265 | |
| 1 | 181 |
| Value | Count | Frequency (%) |
| 0 | 296 | |
| 1 | 150 |
Pclass
Categorical
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 3 | 3 |
| Distinct (%) | 0.7% | 0.7% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 1 | 1 |
| Median length | 1 | 1 |
| Mean length | 1 | 1 |
| Min length | 1 | 1 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 446 | 446 |
| Distinct characters | 3 | 3 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | 1 | 1 |
| 2nd row | 3 | 2 |
| 3rd row | 3 | 1 |
| 4th row | 3 | 3 |
| 5th row | 2 | 3 |
Common Values
| Value | Count | Frequency (%) |
| 3 | 243 | 54.5% |
| 1 | 104 | 23.3% |
| 2 | 99 | 22.2% |
| Value | Count | Frequency (%) |
| 3 | 252 | 56.5% |
| 1 | 108 | 24.2% |
| 2 | 86 | 19.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| 3 | 243 | |
| 1 | 104 | |
| 2 | 99 |
| Value | Count | Frequency (%) |
| 3 | 252 | |
| 1 | 108 | |
| 2 | 86 | 19.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 446 |
| Value | Count | Frequency (%) |
| Decimal Number | 446 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 3 | 243 | |
| 1 | 104 | |
| 2 | 99 |
| Value | Count | Frequency (%) |
| 3 | 252 | |
| 1 | 108 | |
| 2 | 86 | 19.3% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 446 |
| Value | Count | Frequency (%) |
| Common | 446 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 3 | 243 | |
| 1 | 104 | |
| 2 | 99 |
| Value | Count | Frequency (%) |
| 3 | 252 | |
| 1 | 108 | |
| 2 | 86 | 19.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 446 |
| Value | Count | Frequency (%) |
| ASCII | 446 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 3 | 243 | |
| 1 | 104 | |
| 2 | 99 |
| Value | Count | Frequency (%) |
| 3 | 252 | |
| 1 | 108 | |
| 2 | 86 | 19.3% |
Name
Text
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 446 | 446 |
| Distinct (%) | 100.0% | 100.0% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 67 | 57 |
| Median length | 50 | 46 |
| Mean length | 26.526906 | 26.737668 |
| Min length | 12 | 13 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 11831 | 11925 |
| Distinct characters | 60 | 59 |
| Distinct categories | 7 | 7 |
| Distinct scripts | 2 | 2 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 446 | 446 |
| Unique (%) | 100.0% | 100.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | Nicholson, Mr. Arthur Ernest | Crosby, Capt. Edward Gifford |
| 2nd row | Nosworthy, Mr. Richard Cater | Norman, Mr. Robert Douglas |
| 3rd row | de Mulder, Mr. Theodore | Robbins, Mr. Victor |
| 4th row | Asplund, Master. Clarence Gustaf Hugo | Petterson, Mr. Johan Emil |
| 5th row | Richards, Master. George Sibley | Andersen-Jensen, Miss. Carla Christine Nielsine |
| Value | Count | Frequency (%) |
| mr | 264 | 14.7% |
| miss | 98 | 5.5% |
| mrs | 60 | 3.3% |
| william | 30 | 1.7% |
| henry | 21 | 1.2% |
| master | 20 | 1.1% |
| john | 20 | 1.1% |
| thomas | 15 | 0.8% |
| james | 14 | 0.8% |
| mary | 12 | 0.7% |
| Other values (876) | 1243 |
| Value | Count | Frequency (%) |
| mr | 271 | 15.1% |
| miss | 97 | 5.4% |
| mrs | 53 | 2.9% |
| william | 31 | 1.7% |
| john | 22 | 1.2% |
| henry | 21 | 1.2% |
| master | 17 | 0.9% |
| james | 13 | 0.7% |
| george | 12 | 0.7% |
| edward | 10 | 0.6% |
| Other values (873) | 1252 |
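The word tables above list lowercase tokens such as `mr` and `miss` without their trailing periods, which suggests that names are lowercased and stripped of punctuation before counting. The sketch below illustrates one way to produce such counts, applied to the five Dataset A sample rows. The tokenization rule and the `word_counts` helper are assumptions for illustration; the profiler's actual tokenizer may differ.

```python
import string
from collections import Counter

# The five Dataset A sample rows from the Sample table above.
SAMPLE_NAMES = [
    "Nicholson, Mr. Arthur Ernest",
    "Nosworthy, Mr. Richard Cater",
    "de Mulder, Mr. Theodore",
    "Asplund, Master. Clarence Gustaf Hugo",
    "Richards, Master. George Sibley",
]


def word_counts(values):
    """Count lowercase whitespace-separated tokens with surrounding punctuation removed."""
    counts = Counter()
    for value in values:
        for token in value.lower().split():
            word = token.strip(string.punctuation)
            if word:  # skip tokens that were punctuation only
                counts[word] += 1
    return counts


counts = word_counts(SAMPLE_NAMES)
total = sum(counts.values())
for word, count in counts.most_common(2):
    print(f"{word}: {count} ({100 * count / total:.1f}%)")
```

Note that the frequency is relative to the total number of tokens, not the number of rows, which is why `mr` appears in well over half of the names but accounts for only about 15% in the tables.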
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 1351 | 11.4% |
| r | 974 | 8.2% |
| e | 872 | 7.4% |
| a | 814 | 6.9% |
| s | 655 | 5.5% |
| i | 653 | 5.5% |
| n | 645 | 5.5% |
| M | 572 | 4.8% |
| l | 514 | 4.3% |
| o | 459 | 3.9% |
| Other values (50) | 4322 |
| Value | Count | Frequency (%) |
| (space) | 1354 | 11.4% |
| r | 972 | 8.2% |
| e | 856 | 7.2% |
| a | 813 | 6.8% |
| i | 672 | 5.6% |
| n | 661 | 5.5% |
| s | 643 | 5.4% |
| M | 558 | 4.7% |
| l | 533 | 4.5% |
| o | 495 | 4.2% |
| Other values (49) | 4368 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 7585 | |
| Uppercase Letter | 1811 | 15.3% |
| Space Separator | 1351 | 11.4% |
| Other Punctuation | 949 | 8.0% |
| Open Punctuation | 66 | 0.6% |
| Close Punctuation | 66 | 0.6% |
| Dash Punctuation | 3 | < 0.1% |
| Value | Count | Frequency (%) |
| Lowercase Letter | 7695 | |
| Uppercase Letter | 1806 | 15.1% |
| Space Separator | 1354 | 11.4% |
| Other Punctuation | 945 | 7.9% |
| Close Punctuation | 60 | 0.5% |
| Open Punctuation | 60 | 0.5% |
| Dash Punctuation | 5 | < 0.1% |
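The category names in these tables correspond to Unicode general categories (for example `Ll` for Lowercase Letter and `Zs` for Space Separator). As a rough illustration of how such counts can be obtained, the sketch below classifies each character with Python's `unicodedata.category`. The name mapping and the `category_counts` helper are assumptions for this example and are not taken from the profiler's source.

```python
import unicodedata
from collections import Counter

# Two-letter Unicode general category codes mapped to the labels used in the tables.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
    "Nd": "Decimal Number",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# Two names taken from the Sample table above.
for name in ["Nicholson, Mr. Arthur Ernest",
             "Andersen-Jensen, Miss. Carla Christine Nielsine"]:
    print(name, dict(category_counts([name])))
```

Every name contains exactly one comma and one period after the title, which matches the 446 occurrences of each under Other Punctuation.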
Most frequent character per category
Space Separator
| Value | Count | Frequency (%) |
| (space) | 1351 |
| Value | Count | Frequency (%) |
| (space) | 1354 |
Lowercase Letter
| Value | Count | Frequency (%) |
| r | 974 | |
| e | 872 | |
| a | 814 | |
| s | 655 | |
| i | 653 | |
| n | 645 | |
| l | 514 | 6.8% |
| o | 459 | 6.1% |
| t | 321 | 4.2% |
| h | 261 | 3.4% |
| Other values (16) | 1417 |
| Value | Count | Frequency (%) |
| r | 972 | |
| e | 856 | |
| a | 813 | |
| i | 672 | |
| n | 661 | |
| s | 643 | |
| l | 533 | 6.9% |
| o | 495 | 6.4% |
| t | 322 | 4.2% |
| d | 252 | 3.3% |
| Other values (16) | 1476 |
Uppercase Letter
| Value | Count | Frequency (%) |
| M | 572 | |
| A | 119 | 6.6% |
| H | 106 | 5.9% |
| J | 103 | 5.7% |
| E | 86 | 4.7% |
| S | 85 | 4.7% |
| B | 82 | 4.5% |
| C | 72 | 4.0% |
| W | 71 | 3.9% |
| R | 62 | 3.4% |
| Other values (15) | 453 |
| Value | Count | Frequency (%) |
| M | 558 | |
| A | 125 | 6.9% |
| H | 113 | 6.3% |
| J | 98 | 5.4% |
| S | 95 | 5.3% |
| E | 84 | 4.7% |
| C | 83 | 4.6% |
| W | 69 | 3.8% |
| D | 62 | 3.4% |
| B | 60 | 3.3% |
| Other values (15) | 459 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 446 | |
| , | 446 | |
| " | 50 | 5.3% |
| ' | 6 | 0.6% |
| / | 1 | 0.1% |
| Value | Count | Frequency (%) |
| , | 446 | |
| . | 446 | |
| " | 50 | 5.3% |
| ' | 3 | 0.3% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 66 |
| Value | Count | Frequency (%) |
| ( | 60 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 66 |
| Value | Count | Frequency (%) |
| ) | 60 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 3 |
| Value | Count | Frequency (%) |
| - | 5 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 9396 | |
| Common | 2435 | 20.6% |
| Value | Count | Frequency (%) |
| Latin | 9501 | |
| Common | 2424 | 20.3% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| (space) | 1351 | |
| . | 446 | 18.3% |
| , | 446 | 18.3% |
| ( | 66 | 2.7% |
| ) | 66 | 2.7% |
| " | 50 | 2.1% |
| ' | 6 | 0.2% |
| - | 3 | 0.1% |
| / | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| (space) | 1354 | |
| , | 446 | 18.4% |
| . | 446 | 18.4% |
| ) | 60 | 2.5% |
| ( | 60 | 2.5% |
| " | 50 | 2.1% |
| - | 5 | 0.2% |
| ' | 3 | 0.1% |
Latin
| Value | Count | Frequency (%) |
| r | 974 | 10.4% |
| e | 872 | 9.3% |
| a | 814 | 8.7% |
| s | 655 | 7.0% |
| i | 653 | 6.9% |
| n | 645 | 6.9% |
| M | 572 | 6.1% |
| l | 514 | 5.5% |
| o | 459 | 4.9% |
| t | 321 | 3.4% |
| Other values (41) | 2917 |
| Value | Count | Frequency (%) |
| r | 972 | 10.2% |
| e | 856 | 9.0% |
| a | 813 | 8.6% |
| i | 672 | 7.1% |
| n | 661 | 7.0% |
| s | 643 | 6.8% |
| M | 558 | 5.9% |
| l | 533 | 5.6% |
| o | 495 | 5.2% |
| t | 322 | 3.4% |
| Other values (41) | 2976 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 11831 |
| Value | Count | Frequency (%) |
| ASCII | 11925 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 1351 | 11.4% |
| r | 974 | 8.2% |
| e | 872 | 7.4% |
| a | 814 | 6.9% |
| s | 655 | 5.5% |
| i | 653 | 5.5% |
| n | 645 | 5.5% |
| M | 572 | 4.8% |
| l | 514 | 4.3% |
| o | 459 | 3.9% |
| Other values (50) | 4322 |
| Value | Count | Frequency (%) |
| (space) | 1354 | 11.4% |
| r | 972 | 8.2% |
| e | 856 | 7.2% |
| a | 813 | 6.8% |
| i | 672 | 5.6% |
| n | 661 | 5.5% |
| s | 643 | 5.4% |
| M | 558 | 4.7% |
| l | 533 | 4.5% |
| o | 495 | 4.2% |
| Other values (49) | 4368 |
Sex
Categorical
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 2 | 2 |
| Distinct (%) | 0.4% | 0.4% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 6 | 6 |
| Median length | 4 | 4 |
| Mean length | 4.7040359 | 4.6726457 |
| Min length | 4 | 4 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 2098 | 2084 |
| Distinct characters | 5 | 5 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
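For a two-valued column such as Sex, the length and character totals follow directly from the value counts reported under Common Values (289 male and 157 female in Dataset A, 296 and 150 in Dataset B). The sketch below reproduces the total character count and the mean length from those counts; the `length_summary` helper is illustrative only.

```python
# Value counts copied from the Common Values tables for Sex.
COUNTS = {
    "Dataset A": {"male": 289, "female": 157},
    "Dataset B": {"male": 296, "female": 150},
}


def length_summary(value_counts: dict) -> dict:
    """Length statistics of a string column, computed from its value counts."""
    n = sum(value_counts.values())
    total_chars = sum(len(value) * count for value, count in value_counts.items())
    return {
        "total_characters": total_chars,
        "mean_length": total_chars / n,
        "min_length": min(len(value) for value in value_counts),
        "max_length": max(len(value) for value in value_counts),
    }


for name, value_counts in COUNTS.items():
    print(name, length_summary(value_counts))
```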
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | male | male |
| 2nd row | male | male |
| 3rd row | male | male |
| 4th row | male | male |
| 5th row | male | female |
Common Values
| Value | Count | Frequency (%) |
| male | 289 | 64.8% |
| female | 157 | 35.2% |
| Value | Count | Frequency (%) |
| male | 296 | 66.4% |
| female | 150 | 33.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 603 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 157 | 7.5% |
| Value | Count | Frequency (%) |
| e | 596 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 150 | 7.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 2098 |
| Value | Count | Frequency (%) |
| Lowercase Letter | 2084 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 603 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 157 | 7.5% |
| Value | Count | Frequency (%) |
| e | 596 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 150 | 7.2% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 2098 |
| Value | Count | Frequency (%) |
| Latin | 2084 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 603 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 157 | 7.5% |
| Value | Count | Frequency (%) |
| e | 596 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 150 | 7.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2098 |
| Value | Count | Frequency (%) |
| ASCII | 2084 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 603 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 157 | 7.5% |
| Value | Count | Frequency (%) |
| e | 596 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 150 | 7.2% |
Age
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 73 | 74 |
| Distinct (%) | 20.4% | 20.7% |
| Missing | 88 | 88 |
| Missing (%) | 19.7% | 19.7% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 29.365922 | 29.526536 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0.67 | 0.42 |
| Maximum | 74 | 80 |
| Zeros | 0 | 0 |
| Zeros (%) | 0.0% | 0.0% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0.67 | 0.42 |
| 5-th percentile | 4 | 4 |
| Q1 | 20 | 19.25 |
| median | 28 | 28 |
| Q3 | 37.75 | 38 |
| 95-th percentile | 58 | 57.15 |
| Maximum | 74 | 80 |
| Range | 73.33 | 79.58 |
| Interquartile range (IQR) | 17.75 | 18.75 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 14.59244 | 15.003293 |
| Coefficient of variation (CV) | 0.49691747 | 0.50812913 |
| Kurtosis | 0.17882939 | 0.3822428 |
| Mean | 29.365922 | 29.526536 |
| Median Absolute Deviation (MAD) | 8 | 9 |
| Skewness | 0.429838 | 0.50290276 |
| Sum | 10513 | 10570.5 |
| Variance | 212.93929 | 225.09881 |
| Monotonicity | Not monotonic | Not monotonic |
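Because Age has missing values, the denominators differ between rows. The reported figures are consistent with Missing (%) being taken over all 446 observations, while Distinct (%) and Mean are taken over the 358 non-missing values. This is inferred from the numbers rather than from the tool's documentation. The sketch below recomputes the three rows under that assumption; the `summarize` helper is illustrative.

```python
N_OBSERVATIONS = 446

# Figures copied from the Age tables above.
AGE = {
    "Dataset A": {"missing": 88, "distinct": 73, "sum": 10513.0},
    "Dataset B": {"missing": 88, "distinct": 74, "sum": 10570.5},
}


def summarize(col: dict, n_obs: int) -> dict:
    """Recompute percentage and mean rows, excluding missing values where assumed."""
    present = n_obs - col["missing"]  # number of non-missing values
    return {
        "missing_pct": 100 * col["missing"] / n_obs,       # over all rows
        "distinct_pct": 100 * col["distinct"] / present,   # over non-missing rows
        "mean": col["sum"] / present,                      # over non-missing rows
    }


for name, col in AGE.items():
    s = summarize(col, N_OBSERVATIONS)
    print(f"{name}: missing {s['missing_pct']:.1f}%, "
          f"distinct {s['distinct_pct']:.1f}%, mean {s['mean']:.6f}")
```

The same pattern holds for Cabin, where 87 distinct values over 100 non-missing rows give the reported 87.0% for Dataset A.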
| Value | Count | Frequency (%) |
| 30 | 15 | 3.4% |
| 22 | 14 | 3.1% |
| 21 | 14 | 3.1% |
| 19 | 13 | 2.9% |
| 24 | 13 | 2.9% |
| 18 | 13 | 2.9% |
| 25 | 12 | 2.7% |
| 29 | 11 | 2.5% |
| 28 | 11 | 2.5% |
| 27 | 11 | 2.5% |
| Other values (63) | 231 | |
| (Missing) | 88 | 19.7% |
| Value | Count | Frequency (%) |
| 25 | 16 | 3.6% |
| 19 | 15 | 3.4% |
| 18 | 13 | 2.9% |
| 28 | 13 | 2.9% |
| 16 | 13 | 2.9% |
| 24 | 12 | 2.7% |
| 27 | 12 | 2.7% |
| 22 | 12 | 2.7% |
| 26 | 11 | 2.5% |
| 32 | 11 | 2.5% |
| Other values (64) | 230 | |
| (Missing) | 88 | 19.7% |
| Value | Count | Frequency (%) |
| 0.67 | 1 | 0.2% |
| 0.75 | 2 | 0.4% |
| 0.83 | 1 | 0.2% |
| 1 | 3 | |
| 2 | 4 | |
| 3 | 4 | |
| 4 | 6 | |
| 5 | 3 | |
| 7 | 2 | 0.4% |
| 8 | 3 |
| Value | Count | Frequency (%) |
| 0.42 | 1 | 0.2% |
| 0.75 | 2 | 0.4% |
| 0.83 | 2 | 0.4% |
| 0.92 | 1 | 0.2% |
| 1 | 3 | |
| 2 | 6 | |
| 3 | 1 | 0.2% |
| 4 | 5 | |
| 5 | 4 | |
| 6 | 1 | 0.2% |
SibSp
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 7 | 7 |
| Distinct (%) | 1.6% | 1.6% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 0.52017937 | 0.56502242 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| Maximum | 8 | 8 |
| Zeros | 309 | 298 |
| Zeros (%) | 69.3% | 66.8% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| 5-th percentile | 0 | 0 |
| Q1 | 0 | 0 |
| median | 0 | 0 |
| Q3 | 1 | 1 |
| 95-th percentile | 2 | 3 |
| Maximum | 8 | 8 |
| Range | 8 | 8 |
| Interquartile range (IQR) | 1 | 1 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 1.1310911 | 1.156918 |
| Coefficient of variation (CV) | 2.1744252 | 2.0475612 |
| Kurtosis | 18.17709 | 16.229817 |
| Mean | 0.52017937 | 0.56502242 |
| Median Absolute Deviation (MAD) | 0 | 0 |
| Skewness | 3.763717 | 3.5364268 |
| Sum | 232 | 252 |
| Variance | 1.2793672 | 1.3384592 |
| Monotonicity | Not monotonic | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 309 | 69.3% |
| 1 | 99 | 22.2% |
| 2 | 16 | 3.6% |
| 4 | 11 | 2.5% |
| 3 | 5 | 1.1% |
| 8 | 4 | 0.9% |
| 5 | 2 | 0.4% |
| Value | Count | Frequency (%) |
| 0 | 298 | 66.8% |
| 1 | 104 | 23.3% |
| 2 | 20 | 4.5% |
| 4 | 10 | 2.2% |
| 3 | 7 | 1.6% |
| 8 | 4 | 0.9% |
| 5 | 3 | 0.7% |
| Value | Count | Frequency (%) |
| 0 | 309 | 69.3% |
| 1 | 99 | 22.2% |
| 2 | 16 | 3.6% |
| 3 | 5 | 1.1% |
| 4 | 11 | 2.5% |
| 5 | 2 | 0.4% |
| 8 | 4 | 0.9% |
| Value | Count | Frequency (%) |
| 0 | 298 | 66.8% |
| 1 | 104 | 23.3% |
| 2 | 20 | 4.5% |
| 3 | 7 | 1.6% |
| 4 | 10 | 2.2% |
| 5 | 3 | 0.7% |
| 8 | 4 | 0.9% |
Parch
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 6 | 6 |
| Distinct (%) | 1.3% | 1.3% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 0.38340807 | 0.43049327 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| Maximum | 5 | 5 |
| Zeros | 336 | 332 |
| Zeros (%) | 75.3% | 74.4% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| 5-th percentile | 0 | 0 |
| Q1 | 0 | 0 |
| median | 0 | 0 |
| Q3 | 0 | 1 |
| 95-th percentile | 2 | 2 |
| Maximum | 5 | 5 |
| Range | 5 | 5 |
| Interquartile range (IQR) | 0 | 1 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 0.77232644 | 0.88094088 |
| Coefficient of variation (CV) | 2.0143719 | 2.0463523 |
| Kurtosis | 7.3553721 | 8.1877072 |
| Mean | 0.38340807 | 0.43049327 |
| Median Absolute Deviation (MAD) | 0 | 0 |
| Skewness | 2.4232583 | 2.6097847 |
| Sum | 171 | 192 |
| Variance | 0.59648813 | 0.77605683 |
| Monotonicity | Not monotonic | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 336 | 75.3% |
| 1 | 60 | 13.5% |
| 2 | 45 | 10.1% |
| 5 | 2 | 0.4% |
| 4 | 2 | 0.4% |
| 3 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 332 | 74.4% |
| 1 | 59 | 13.2% |
| 2 | 44 | 9.9% |
| 5 | 5 | 1.1% |
| 3 | 4 | 0.9% |
| 4 | 2 | 0.4% |
| Value | Count | Frequency (%) |
| 0 | 336 | 75.3% |
| 1 | 60 | 13.5% |
| 2 | 45 | 10.1% |
| 3 | 1 | 0.2% |
| 4 | 2 | 0.4% |
| 5 | 2 | 0.4% |
| Value | Count | Frequency (%) |
| 0 | 332 | 74.4% |
| 1 | 59 | 13.2% |
| 2 | 44 | 9.9% |
| 3 | 4 | 0.9% |
| 4 | 2 | 0.4% |
| 5 | 5 | 1.1% |
Ticket
Text
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 382 | 382 |
| Distinct (%) | 85.7% | 85.7% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 18 | 18 |
| Median length | 17 | 17 |
| Mean length | 6.7802691 | 6.7242152 |
| Min length | 3 | 4 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 3024 | 2999 |
| Distinct characters | 35 | 31 |
| Distinct categories | 5 | 5 |
| Distinct scripts | 2 | 2 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 333 | 332 |
| Unique (%) | 74.7% | 74.4% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | 693 | WE/P 5735 |
| 2nd row | A/4. 39886 | 218629 |
| 3rd row | 345774 | PC 17757 |
| 4th row | 347077 | 347076 |
| 5th row | 29106 | 350046 |
| Value | Count | Frequency (%) |
| pc | 33 | 5.7% |
| c.a | 17 | 3.0% |
| a/5 | 8 | 1.4% |
| ca | 7 | 1.2% |
| 1601 | 5 | 0.9% |
| ston/o | 5 | 0.9% |
| 2 | 5 | 0.9% |
| sc/paris | 5 | 0.9% |
| ston/o2 | 5 | 0.9% |
| a/4 | 5 | 0.9% |
| Other values (407) | 479 |
| Value | Count | Frequency (%) |
| pc | 28 | 5.0% |
| c.a | 14 | 2.5% |
| a/5 | 8 | 1.4% |
| ca | 7 | 1.3% |
| 347082 | 6 | 1.1% |
| w./c | 6 | 1.1% |
| sc/paris | 5 | 0.9% |
| 3101295 | 4 | 0.7% |
| 2 | 4 | 0.7% |
| ston/o | 4 | 0.7% |
| Other values (400) | 473 |
Most occurring characters
| Value | Count | Frequency (%) |
| 3 | 361 | |
| 1 | 343 | |
| 2 | 296 | |
| 7 | 249 | 8.2% |
| 4 | 228 | 7.5% |
| 6 | 214 | 7.1% |
| 0 | 205 | 6.8% |
| 5 | 198 | 6.5% |
| 8 | 153 | 5.1% |
| 9 | 153 | 5.1% |
| Other values (25) | 624 |
| Value | Count | Frequency (%) |
| 3 | 367 | |
| 1 | 353 | |
| 2 | 296 | |
| 7 | 250 | |
| 4 | 236 | |
| 6 | 213 | 7.1% |
| 0 | 204 | 6.8% |
| 5 | 183 | 6.1% |
| 9 | 164 | 5.5% |
| 8 | 154 | 5.1% |
| Other values (21) | 579 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 2400 | |
| Uppercase Letter | 323 | 10.7% |
| Other Punctuation | 156 | 5.2% |
| Space Separator | 128 | 4.2% |
| Lowercase Letter | 17 | 0.6% |
| Value | Count | Frequency (%) |
| Decimal Number | 2420 | |
| Uppercase Letter | 312 | 10.4% |
| Other Punctuation | 146 | 4.9% |
| Space Separator | 113 | 3.8% |
| Lowercase Letter | 8 | 0.3% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 3 | 361 | |
| 1 | 343 | |
| 2 | 296 | |
| 7 | 249 | |
| 4 | 228 | |
| 6 | 214 | |
| 0 | 205 | |
| 5 | 198 | |
| 8 | 153 | |
| 9 | 153 |
| Value | Count | Frequency (%) |
| 3 | 367 | |
| 1 | 353 | |
| 2 | 296 | |
| 7 | 250 | |
| 4 | 236 | |
| 6 | 213 | |
| 0 | 204 | |
| 5 | 183 | |
| 9 | 164 | |
| 8 | 154 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 128 |
| Value | Count | Frequency (%) |
| (space) | 113 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 108 | |
| / | 48 |
| Value | Count | Frequency (%) |
| . | 100 | |
| / | 46 |
Uppercase Letter
| Value | Count | Frequency (%) |
| C | 86 | |
| P | 54 | |
| A | 46 | |
| O | 43 | |
| S | 37 | |
| N | 16 | 5.0% |
| T | 15 | 4.6% |
| W | 6 | 1.9% |
| F | 5 | 1.5% |
| I | 4 | 1.2% |
| Other values (6) | 11 | 3.4% |
| Value | Count | Frequency (%) |
| C | 75 | |
| P | 49 | |
| O | 47 | |
| A | 39 | |
| S | 34 | |
| N | 18 | 5.8% |
| T | 17 | 5.4% |
| W | 10 | 3.2% |
| Q | 7 | 2.2% |
| I | 5 | 1.6% |
| Other values (4) | 11 | 3.5% |
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 5 | |
| s | 4 | |
| r | 3 | |
| i | 3 | |
| l | 1 | 5.9% |
| e | 1 | 5.9% |
| Value | Count | Frequency (%) |
| a | 2 | |
| r | 2 | |
| i | 2 | |
| s | 2 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 2684 | |
| Latin | 340 | 11.2% |
| Value | Count | Frequency (%) |
| Common | 2679 | |
| Latin | 320 | 10.7% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 3 | 361 | |
| 1 | 343 | |
| 2 | 296 | |
| 7 | 249 | |
| 4 | 228 | |
| 6 | 214 | |
| 0 | 205 | |
| 5 | 198 | |
| 8 | 153 | |
| 9 | 153 | |
| Other values (3) | 284 |
| Value | Count | Frequency (%) |
| 3 | 367 | |
| 1 | 353 | |
| 2 | 296 | |
| 7 | 250 | |
| 4 | 236 | |
| 6 | 213 | |
| 0 | 204 | |
| 5 | 183 | |
| 9 | 164 | |
| 8 | 154 | |
| Other values (3) | 259 |
Latin
| Value | Count | Frequency (%) |
| C | 86 | |
| P | 54 | |
| A | 46 | |
| O | 43 | |
| S | 37 | |
| N | 16 | 4.7% |
| T | 15 | 4.4% |
| W | 6 | 1.8% |
| F | 5 | 1.5% |
| a | 5 | 1.5% |
| Other values (12) | 27 | 7.9% |
| Value | Count | Frequency (%) |
| C | 75 | |
| P | 49 | |
| O | 47 | |
| A | 39 | |
| S | 34 | |
| N | 18 | 5.6% |
| T | 17 | 5.3% |
| W | 10 | 3.1% |
| Q | 7 | 2.2% |
| I | 5 | 1.6% |
| Other values (8) | 19 | 5.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3024 |
| Value | Count | Frequency (%) |
| ASCII | 2999 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 3 | 361 | |
| 1 | 343 | |
| 2 | 296 | |
| 7 | 249 | 8.2% |
| 4 | 228 | 7.5% |
| 6 | 214 | 7.1% |
| 0 | 205 | 6.8% |
| 5 | 198 | 6.5% |
| 8 | 153 | 5.1% |
| 9 | 153 | 5.1% |
| Other values (25) | 624 |
| Value | Count | Frequency (%) |
| 3 | 367 | |
| 1 | 353 | |
| 2 | 296 | |
| 7 | 250 | |
| 4 | 236 | |
| 6 | 213 | 7.1% |
| 0 | 204 | 6.8% |
| 5 | 183 | 6.1% |
| 9 | 164 | 5.5% |
| 8 | 154 | 5.1% |
| Other values (21) | 579 |
Fare
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 181 | 183 |
| Distinct (%) | 40.6% | 41.0% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 33.656978 | 32.819955 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| Maximum | 512.3292 | 512.3292 |
| Zeros | 6 | 5 |
| Zeros (%) | 1.3% | 1.1% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| 5-th percentile | 7.2292 | 7.2292 |
| Q1 | 7.8958 | 7.8958 |
| median | 14.25415 | 14.4542 |
| Q3 | 31 | 31.275 |
| 95-th percentile | 118.31875 | 118.31875 |
| Maximum | 512.3292 | 512.3292 |
| Range | 512.3292 | 512.3292 |
| Interquartile range (IQR) | 23.1042 | 23.3792 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 54.525694 | 52.5157 |
| Coefficient of variation (CV) | 1.6200413 | 1.600115 |
| Kurtosis | 30.009989 | 34.233795 |
| Mean | 33.656978 | 32.819955 |
| Median Absolute Deviation (MAD) | 6.71875 | 6.7667 |
| Skewness | 4.6670045 | 4.9406518 |
| Sum | 15011.012 | 14637.7 |
| Variance | 2973.0513 | 2757.8988 |
| Monotonicity | Not monotonic | Not monotonic |
| Value | Count | Frequency (%) |
| 13 | 20 | 4.5% |
| 7.8958 | 19 | 4.3% |
| 10.5 | 18 | 4.0% |
| 7.75 | 17 | 3.8% |
| 8.05 | 17 | 3.8% |
| 26 | 13 | 2.9% |
| 7.775 | 11 | 2.5% |
| 7.25 | 10 | 2.2% |
| 7.925 | 10 | 2.2% |
| 7.2292 | 9 | 2.0% |
| Other values (171) | 302 |
| Value | Count | Frequency (%) |
| 7.8958 | 21 | 4.7% |
| 13 | 20 | 4.5% |
| 7.75 | 16 | 3.6% |
| 8.05 | 15 | 3.4% |
| 26 | 15 | 3.4% |
| 10.5 | 13 | 2.9% |
| 7.775 | 11 | 2.5% |
| 7.8542 | 9 | 2.0% |
| 7.925 | 9 | 2.0% |
| 7.2292 | 8 | 1.8% |
| Other values (173) | 309 |
| Value | Count | Frequency (%) |
| 0 | 6 | 1.3% |
| 4.0125 | 1 | 0.2% |
| 6.2375 | 1 | 0.2% |
| 6.4375 | 1 | 0.2% |
| 6.45 | 1 | 0.2% |
| 6.8583 | 1 | 0.2% |
| 6.975 | 1 | 0.2% |
| 7.05 | 2 | 0.4% |
| 7.125 | 2 | 0.4% |
| 7.225 | 4 |
| Value | Count | Frequency (%) |
| 0 | 5 | 1.1% |
| 6.4375 | 1 | 0.2% |
| 6.45 | 1 | 0.2% |
| 6.4958 | 1 | 0.2% |
| 6.75 | 1 | 0.2% |
| 6.95 | 1 | 0.2% |
| 6.975 | 1 | 0.2% |
| 7.05 | 4 | |
| 7.0542 | 1 | 0.2% |
| 7.225 | 5 |
Cabin
Text
| Dataset A | Dataset B | |
|---|---|---|
| Distinct | 87 | 88 |
| Distinct (%) | 87.0% | 86.3% |
| Missing | 346 | 344 |
| Missing (%) | 77.6% | 77.1% |
| Memory size | 7.0 KiB | 7.0 KiB |
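The two percentage rows above appear to use different denominators, which is easy to miss. As a rough check (an inference from the figures in this table, not something the report states), Missing (%) divides by all 446 observations, while Distinct (%) divides by the non-missing values only. The helper name `pct` is hypothetical.

```python
N_OBS = 446  # observations per dataset, from the report overview

def pct(part, whole):
    """Percentage rounded to one decimal place, as displayed in the report."""
    return round(100 * part / whole, 1)

# (distinct, missing) counts for Cabin, copied from the table above.
cabin = {"A": (87, 346), "B": (88, 344)}

for name, (distinct, missing) in cabin.items():
    non_missing = N_OBS - missing  # values actually present in the column
    print(f"Dataset {name}: missing={pct(missing, N_OBS)}% "
          f"distinct={pct(distinct, non_missing)}%")
```

Under this assumption the recomputed percentages match the table for both datasets.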
Length
| Dataset A | Dataset B | |
|---|---|---|
| Max length | 15 | 15 |
| Median length | 3 | 3 |
| Mean length | 3.73 | 3.4901961 |
| Min length | 1 | 1 |
Characters and Unicode
| Dataset A | Dataset B | |
|---|---|---|
| Total characters | 373 | 356 |
| Distinct characters | 19 | 19 |
| Distinct categories | 3 | 3 |
| Distinct scripts | 2 | 2 |
| Distinct blocks | 1 | 1 |
Unique
| Dataset A | Dataset B | |
|---|---|---|
| Unique | 75 | 77 |
| Unique (%) | 75.0% | 75.5% |
Sample
| Dataset A | Dataset B | |
|---|---|---|
| 1st row | D35 | B22 |
| 2nd row | A20 | B94 |
| 3rd row | T | E17 |
| 4th row | C95 | C110 |
| 5th row | B96 B98 | E33 |
| Value | Count | Frequency (%) |
| b96 | 3 | 2.5% |
| f | 3 | 2.5% |
| b98 | 3 | 2.5% |
| b5 | 2 | 1.7% |
| b49 | 2 | 1.7% |
| e25 | 2 | 1.7% |
| c27 | 2 | 1.7% |
| c25 | 2 | 1.7% |
| c23 | 2 | 1.7% |
| g73 | 2 | 1.7% |
| Other values (88) | 97 | 80.8% |
| Value | Count | Frequency (%) |
| g6 | 4 | 3.3% |
| b98 | 3 | 2.5% |
| b96 | 3 | 2.5% |
| f | 3 | 2.5% |
| d | 2 | 1.7% |
| c27 | 2 | 1.7% |
| c25 | 2 | 1.7% |
| c23 | 2 | 1.7% |
| b18 | 2 | 1.7% |
| e101 | 2 | 1.7% |
| Other values (89) | 95 | 79.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| B | 38 | 10.2% |
| 2 | 33 | 8.8% |
| 3 | 29 | 7.8% |
| C | 29 | 7.8% |
| 1 | 27 | 7.2% |
| 6 | 26 | 7.0% |
| 4 | 22 | 5.9% |
| 5 | 21 | 5.6% |
| 8 | 21 | 5.6% |
| (space) | 20 | 5.4% |
| Other values (9) | 107 | 28.7% |
| Value | Count | Frequency (%) |
| B | 35 | 9.8% |
| 2 | 31 | 8.7% |
| C | 30 | 8.4% |
| 3 | 30 | 8.4% |
| 6 | 24 | 6.7% |
| 1 | 24 | 6.7% |
| 7 | 22 | 6.2% |
| 8 | 20 | 5.6% |
| 5 | 20 | 5.6% |
| (space) | 18 | 5.1% |
| Other values (9) | 102 | 28.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 233 | 62.5% |
| Uppercase Letter | 120 | 32.2% |
| Space Separator | 20 | 5.4% |
| Value | Count | Frequency (%) |
| Decimal Number | 218 | 61.2% |
| Uppercase Letter | 120 | 33.7% |
| Space Separator | 18 | 5.1% |
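The category names in these tables match Unicode general categories. As an illustration (assuming that is how the report classifies characters, which the source does not state), Python's standard `unicodedata` module can produce the same kind of breakdown for a single Cabin value. The display-name mapping and the function name `category_counts` are hypothetical.

```python
import unicodedata
from collections import Counter

# Display names for the three general-category codes seen in this report.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
}

def category_counts(text):
    """Count the characters of `text` by Unicode general category."""
    codes = (unicodedata.category(ch) for ch in text)
    # Fall back to the raw code for categories not listed above.
    return Counter(CATEGORY_NAMES.get(code, code) for code in codes)

# "B96 B98" is the 5th sample row of Dataset A.
print(category_counts("B96 B98"))
```

Summing such counts over every non-missing Cabin value would give totals of the kind shown in the tables above.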
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| B | 38 | 31.7% |
| C | 29 | 24.2% |
| D | 17 | 14.2% |
| E | 15 | 12.5% |
| F | 9 | 7.5% |
| A | 8 | 6.7% |
| G | 3 | 2.5% |
| T | 1 | 0.8% |
| Value | Count | Frequency (%) |
| B | 35 | 29.2% |
| C | 30 | 25.0% |
| D | 16 | 13.3% |
| E | 15 | 12.5% |
| F | 8 | 6.7% |
| A | 8 | 6.7% |
| G | 7 | 5.8% |
| T | 1 | 0.8% |
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 33 | 14.2% |
| 3 | 29 | 12.4% |
| 1 | 27 | 11.6% |
| 6 | 26 | 11.2% |
| 4 | 22 | 9.4% |
| 5 | 21 | 9.0% |
| 8 | 21 | 9.0% |
| 9 | 19 | 8.2% |
| 7 | 18 | 7.7% |
| 0 | 17 | 7.3% |
| Value | Count | Frequency (%) |
| 2 | 31 | 14.2% |
| 3 | 30 | 13.8% |
| 6 | 24 | 11.0% |
| 1 | 24 | 11.0% |
| 7 | 22 | 10.1% |
| 8 | 20 | 9.2% |
| 5 | 20 | 9.2% |
| 9 | 17 | 7.8% |
| 0 | 16 | 7.3% |
| 4 | 14 | 6.4% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 20 | 100.0% |
| Value | Count | Frequency (%) |
| (space) | 18 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 253 | 67.8% |
| Latin | 120 | 32.2% |
| Value | Count | Frequency (%) |
| Common | 236 | 66.3% |
| Latin | 120 | 33.7% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| B | 38 | 31.7% |
| C | 29 | 24.2% |
| D | 17 | 14.2% |
| E | 15 | 12.5% |
| F | 9 | 7.5% |
| A | 8 | 6.7% |
| G | 3 | 2.5% |
| T | 1 | 0.8% |
| Value | Count | Frequency (%) |
| B | 35 | 29.2% |
| C | 30 | 25.0% |
| D | 16 | 13.3% |
| E | 15 | 12.5% |
| F | 8 | 6.7% |
| A | 8 | 6.7% |
| G | 7 | 5.8% |
| T | 1 | 0.8% |
Common
| Value | Count | Frequency (%) |
| 2 | 33 | 13.0% |
| 3 | 29 | 11.5% |
| 1 | 27 | 10.7% |
| 6 | 26 | 10.3% |
| 4 | 22 | 8.7% |
| 5 | 21 | 8.3% |
| 8 | 21 | 8.3% |
| (space) | 20 | 7.9% |
| 9 | 19 | 7.5% |
| 7 | 18 | 7.1% |
| Value | Count | Frequency (%) |
| 2 | 31 | 13.1% |
| 3 | 30 | 12.7% |
| 6 | 24 | 10.2% |
| 1 | 24 | 10.2% |
| 7 | 22 | 9.3% |
| 8 | 20 | 8.5% |
| 5 | 20 | 8.5% |
| (space) | 18 | 7.6% |
| 9 | 17 | 7.2% |
| 0 | 16 | 6.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 373 | 100.0% |
| Value | Count | Frequency (%) |
| ASCII | 356 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| B | 38 | 10.2% |
| 2 | 33 | 8.8% |
| 3 | 29 | 7.8% |
| C | 29 | 7.8% |
| 1 | 27 | 7.2% |
| 6 | 26 | 7.0% |
| 4 | 22 | 5.9% |
| 5 | 21 | 5.6% |
| 8 | 21 | 5.6% |
| (space) | 20 | 5.4% |
| Other values (9) | 107 | 28.7% |
| Value | Count | Frequency (%) |
| B | 35 | 9.8% |
| 2 | 31 | 8.7% |
| C | 30 | 8.4% |
| 3 | 30 | 8.4% |
| 6 | 24 | 6.7% |
| 1 | 24 | 6.7% |
| 7 | 22 | 6.2% |
| 8 | 20 | 5.6% |
| 5 | 20 | 5.6% |
| (space) | 18 | 5.1% |
| Other values (9) | 102 | 28.7% |
Embarked
Categorical
| Dataset A | Dataset B | |
|---|---|---|
| Distinct | 3 | 3 |
| Distinct (%) | 0.7% | 0.7% |
| Missing | 1 | 0 |
| Missing (%) | 0.2% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| Dataset A | Dataset B | |
|---|---|---|
| Max length | 1 | 1 |
| Median length | 1 | 1 |
| Mean length | 1 | 1 |
| Min length | 1 | 1 |
Characters and Unicode
| Dataset A | Dataset B | |
|---|---|---|
| Total characters | 445 | 446 |
| Distinct characters | 3 | 3 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| Dataset A | Dataset B | |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| Dataset A | Dataset B | |
|---|---|---|
| 1st row | S | S |
| 2nd row | S | S |
| 3rd row | S | C |
| 4th row | S | S |
| 5th row | S | S |
Common Values
| Value | Count | Frequency (%) |
| S | 318 | 71.3% |
| C | 88 | 19.7% |
| Q | 39 | 8.7% |
| (Missing) | 1 | 0.2% |
| Value | Count | Frequency (%) |
| S | 319 | 71.5% |
| C | 87 | 19.5% |
| Q | 40 | 9.0% |
[Figures: Length; Common Values (Plot), Dataset A and Dataset B]
| Value | Count | Frequency (%) |
| s | 318 | 71.5% |
| c | 88 | 19.8% |
| q | 39 | 8.8% |
| Value | Count | Frequency (%) |
| s | 319 | 71.5% |
| c | 87 | 19.5% |
| q | 40 | 9.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| S | 318 | 71.5% |
| C | 88 | 19.8% |
| Q | 39 | 8.8% |
| Value | Count | Frequency (%) |
| S | 319 | 71.5% |
| C | 87 | 19.5% |
| Q | 40 | 9.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 445 | 100.0% |
| Value | Count | Frequency (%) |
| Uppercase Letter | 446 | 100.0% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 318 | 71.5% |
| C | 88 | 19.8% |
| Q | 39 | 8.8% |
| Value | Count | Frequency (%) |
| S | 319 | 71.5% |
| C | 87 | 19.5% |
| Q | 40 | 9.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 445 | 100.0% |
| Value | Count | Frequency (%) |
| Latin | 446 | 100.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| S | 318 | 71.5% |
| C | 88 | 19.8% |
| Q | 39 | 8.8% |
| Value | Count | Frequency (%) |
| S | 319 | 71.5% |
| C | 87 | 19.5% |
| Q | 40 | 9.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 445 | 100.0% |
| Value | Count | Frequency (%) |
| ASCII | 446 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| S | 318 | 71.5% |
| C | 88 | 19.8% |
| Q | 39 | 8.8% |
| Value | Count | Frequency (%) |
| S | 319 | 71.5% |
| C | 87 | 19.5% |
| Q | 40 | 9.0% |
Dataset A
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 545 | 546 | 0 | 1 | Nicholson, Mr. Arthur Ernest | male | 64.00 | 0 | 0 | 693 | 26.0000 | NaN | S |
| 51 | 52 | 0 | 3 | Nosworthy, Mr. Richard Cater | male | 21.00 | 0 | 0 | A/4. 39886 | 7.8000 | NaN | S |
| 286 | 287 | 1 | 3 | de Mulder, Mr. Theodore | male | 30.00 | 0 | 0 | 345774 | 9.5000 | NaN | S |
| 182 | 183 | 0 | 3 | Asplund, Master. Clarence Gustaf Hugo | male | 9.00 | 4 | 2 | 347077 | 31.3875 | NaN | S |
| 831 | 832 | 1 | 2 | Richards, Master. George Sibley | male | 0.83 | 1 | 1 | 29106 | 18.7500 | NaN | S |
| 525 | 526 | 0 | 3 | Farrell, Mr. James | male | 40.50 | 0 | 0 | 367232 | 7.7500 | NaN | Q |
| 330 | 331 | 1 | 3 | McCoy, Miss. Agnes | female | NaN | 2 | 0 | 367226 | 23.2500 | NaN | Q |
| 870 | 871 | 0 | 3 | Balkic, Mr. Cerin | male | 26.00 | 0 | 0 | 349248 | 7.8958 | NaN | S |
| 248 | 249 | 1 | 1 | Beckwith, Mr. Richard Leonard | male | 37.00 | 1 | 1 | 11751 | 52.5542 | D35 | S |
| 599 | 600 | 1 | 1 | Duff Gordon, Sir. Cosmo Edmund ("Mr Morgan") | male | 49.00 | 1 | 0 | PC 17485 | 56.9292 | A20 | C |
Dataset B
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 745 | 746 | 0 | 1 | Crosby, Capt. Edward Gifford | male | 70.0 | 1 | 1 | WE/P 5735 | 71.0000 | B22 | S |
| 562 | 563 | 0 | 2 | Norman, Mr. Robert Douglas | male | 28.0 | 0 | 0 | 218629 | 13.5000 | NaN | S |
| 557 | 558 | 0 | 1 | Robbins, Mr. Victor | male | NaN | 0 | 0 | PC 17757 | 227.5250 | NaN | C |
| 442 | 443 | 0 | 3 | Petterson, Mr. Johan Emil | male | 25.0 | 1 | 0 | 347076 | 7.7750 | NaN | S |
| 192 | 193 | 1 | 3 | Andersen-Jensen, Miss. Carla Christine Nielsine | female | 19.0 | 1 | 0 | 350046 | 7.8542 | NaN | S |
| 263 | 264 | 0 | 1 | Harrison, Mr. William | male | 40.0 | 0 | 0 | 112059 | 0.0000 | B94 | S |
| 845 | 846 | 0 | 3 | Abbing, Mr. Anthony | male | 42.0 | 0 | 0 | C.A. 5547 | 7.5500 | NaN | S |
| 677 | 678 | 1 | 3 | Turja, Miss. Anna Sofia | female | 18.0 | 0 | 0 | 4138 | 9.8417 | NaN | S |
| 651 | 652 | 1 | 2 | Doling, Miss. Elsie | female | 18.0 | 0 | 1 | 231919 | 23.0000 | NaN | S |
| 285 | 286 | 0 | 3 | Stankovic, Mr. Ivan | male | 33.0 | 0 | 0 | 349239 | 8.6625 | NaN | C |
Dataset A
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 539 | 540 | 1 | 1 | Frolicher, Miss. Hedwig Margaritha | female | 22.0 | 0 | 2 | 13568 | 49.5000 | B39 | C |
| 609 | 610 | 1 | 1 | Shutes, Miss. Elizabeth W | female | 40.0 | 0 | 0 | PC 17582 | 153.4625 | C125 | S |
| 80 | 81 | 0 | 3 | Waelens, Mr. Achille | male | 22.0 | 0 | 0 | 345767 | 9.0000 | NaN | S |
| 347 | 348 | 1 | 3 | Davison, Mrs. Thomas Henry (Mary E Finck) | female | NaN | 1 | 0 | 386525 | 16.1000 | NaN | S |
| 168 | 169 | 0 | 1 | Baumann, Mr. John D | male | NaN | 0 | 0 | PC 17318 | 25.9250 | NaN | S |
| 553 | 554 | 1 | 3 | Leeni, Mr. Fahim ("Philip Zenni") | male | 22.0 | 0 | 0 | 2620 | 7.2250 | NaN | C |
| 762 | 763 | 1 | 3 | Barah, Mr. Hanna Assi | male | 20.0 | 0 | 0 | 2663 | 7.2292 | NaN | C |
| 700 | 701 | 1 | 1 | Astor, Mrs. John Jacob (Madeleine Talmadge Force) | female | 18.0 | 1 | 0 | PC 17757 | 227.5250 | C62 C64 | C |
| 860 | 861 | 0 | 3 | Hansen, Mr. Claus Peter | male | 41.0 | 2 | 0 | 350026 | 14.1083 | NaN | S |
| 267 | 268 | 1 | 3 | Persson, Mr. Ernst Ulrik | male | 25.0 | 1 | 0 | 347083 | 7.7750 | NaN | S |
Dataset B
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 659 | 660 | 0 | 1 | Newell, Mr. Arthur Webster | male | 58.0 | 0 | 2 | 35273 | 113.2750 | D48 | C |
| 493 | 494 | 0 | 1 | Artagaveytia, Mr. Ramon | male | 71.0 | 0 | 0 | PC 17609 | 49.5042 | NaN | C |
| 408 | 409 | 0 | 3 | Birkeland, Mr. Hans Martin Monsen | male | 21.0 | 0 | 0 | 312992 | 7.7750 | NaN | S |
| 55 | 56 | 1 | 1 | Woolner, Mr. Hugh | male | NaN | 0 | 0 | 19947 | 35.5000 | C52 | S |
| 246 | 247 | 0 | 3 | Lindahl, Miss. Agda Thorilda Viktoria | female | 25.0 | 0 | 0 | 347071 | 7.7750 | NaN | S |
| 355 | 356 | 0 | 3 | Vanden Steen, Mr. Leo Peter | male | 28.0 | 0 | 0 | 345783 | 9.5000 | NaN | S |
| 266 | 267 | 0 | 3 | Panula, Mr. Ernesti Arvid | male | 16.0 | 4 | 1 | 3101295 | 39.6875 | NaN | S |
| 604 | 605 | 1 | 1 | Homer, Mr. Harry ("Mr E Haven") | male | 35.0 | 0 | 0 | 111426 | 26.5500 | NaN | C |
| 9 | 10 | 1 | 2 | Nasser, Mrs. Nicholas (Adele Achem) | female | 14.0 | 1 | 0 | 237736 | 30.0708 | NaN | C |
| 734 | 735 | 0 | 2 | Troupiansky, Mr. Moses Aaron | male | 23.0 | 0 | 0 | 233639 | 13.0000 | NaN | S |
Dataset A
Dataset does not contain duplicate rows.
Dataset B
Dataset does not contain duplicate rows.